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Fifty years of genetic and molecular experiments have revealed a wealth of molecular interactions involved in the con- 
trol of cell division. In light of the complexity of this control system, mathematical modeling has proved useful in analyzing 
biochemical hypotheses that can be tested experimentally. Stochastic modeling has been especially useful in understand- 
ing the intrinsic variability of cell cycle events, but stochastic modeling has been hampered by a lack of reliable data on 
the absolute numbers of mRNA molecules per cell for cell cycle control genes. To fill this void, we used fluorescence in 
situ hybridization (FISH) to collect single molecule mRNA data for 16 cell cycle regulators in budding yeast, Saccharomyces 
cerevisiae. From statistical distributions of single-cell mRNA counts, we are able to extract the periodicity, timing, and mag- 
nitude of transcript abundance during the cell cycle. We used these parameters to improve a stochastic model of the cell 
cycle to better reflect the variability of molecular and phenotypic data on cell cycle progression in budding yeast. 



Introduction 

The eukaryotic cell division cycle is a complex process in 
which cells grow, replicate their DNA, and then partition their 
replicated chromosomes and organelles between 2 new daughter 
cells. These events are coordinated by a family of cyclin-depen- 
dent protein kinases (CDKs), whose activities are controlled by 
a complex network of interacting proteins. The budding yeast 
cell cycle is regulated by a single, constitutively expressed CDK 
(Cdc28), whose activity and substrate specificity change in a 
consistent manner during each stage of the cell division cycle. 1 " 5 
The activity of Cdc28 in budding yeast is regulated by periodic 
changes in the availability of protein cofactors, called cyclins, 
each of which is produced and degraded during a specific phase 
of the cell cycle. Kinases, phosphatases, and CDK-inhibitory 
proteins (CKIs) modify the activities of these cyclin— CDK com- 
plexes. The abundances of the cyclins and many other regula- 
tory proteins change during progression through the cell cycle 
because cyclin-CDK activity regulates their periodic synthesis 
and degradation. 

Like any gene or protein regulatory network, the cell cycle 
is a noisy process; in genetically identical populations, there is 



significant cell-to-cell variation even in a uniform environment. 6 
The sources of non-genetic variation in gene expression are a 
combination of intrinsic noise (the probabilistic nature of tran- 
scription and translation events resulting from the low abundance 
of transcription factors, their target genes, and transcripts) and 
extrinsic noise (such as inequalities in the partitioning of cellular 
components during division). 7 The abundance of a transcript or 
protein in a cell at any given time depends on the relative rates 
of synthesis and degradation and can be adequately described as 
a stochastic birth-death process with exponentially distributed 
waiting times. 6 

Mathematical modeling is now an integral component of cell 
cycle research. Computational biologists have developed mod- 
els that are consistent with diverse sets of experimental data. 8 " 17 
Models have helped predict previously unknown interactions 
that could be tested experimentally. 8 ' 13,14,18 " 24 The first genera- 
tion of models were "deterministic", relying on nonlinear ordi- 
nary differential equations (ODEs) to capture the viability (and 
other phenotypic characteristics) of dozens of mutants of the 
cell cycle control system. 25 However, other aspects of cell cycle 
progression, such as variability in interdivision time or cell size 
at division, cannot be explained by deterministic models. 12 To 
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account for these characteristics, along with the partial viabil- 
ity of some mutants, requires "stochastic" models of cell cycle 
progression. 8,14 

The first stochastic models of the cell cycle were derived 
from deterministic models by introducing extrinsic noise associ- 
ated with partitioning of cellular components during cell divi- 
sion, 26 by introducing Gaussian white noise into reaction rate 
equations, 27 or by modifying Boolean deterministic models. 28 " 32 
However, experiments in yeast cells with increasing gene dosage 
suggest that most noise in the cell cycle is a consequence of small 
copy number per cell of some molecular regulators, especially in 
Gj phase of the cell cycle. 14 The early phenomenological models 
of noise in the cell cycle are not adequate to analyze the role of 
intrinsic noise in cell cycle fluctuations, which requires explicitly 
modeling molecular interactions at their source. Initially, stochas- 
tic models focused on protein activity levels and protein-protein 
interactions, since they are the products of gene expression and 
the agents of cellular activity. 26,27,29,31 ' 32 By combining time-lapse 
microscopy, microfluidics, and image processing, it has been 
possible to observe in single cells the trajectories through the 
cell cycle of regulator proteins tagged with fluorescent protein 
domains. 14 ' 23,33,34 However, cell cycle control also involves a com- 
plex system of transcriptional regulation. In fact, very little gene 
expression noise can be explained solely in terms of protein syn- 
thesis and degradation. A dominant proportion of intrinsic noise 
is the low abundance of mRNA. 11 Therefore, to develop better 
predictive stochastic models of cell cycle regulation, one must 
account quantitatively for the fluctuations of mRNA abundances 
in individual cells. 

Accurate simulation of mRNA abundances, which are likely 
to be on the order of 5-10 molecules per yeast cell, requires the 
use of the Gillespie stochastic simulation algorithm (SSA), which 
is based on the assumption that the fluctuating reactions are 
pseudo-elementary, i.e., that they are adequately described by 
mass-action kinetics. 35 Because most deterministic models of cell 
cycle regulation rely heavily on complex rate laws (Michaelis— 
Menten kinetics, Hill functions, zero-order ultrasensitivity), they 
are not easily converted into mass-action-type models, suitable for 
simulation by Gillespie SSA. 36 To investigate the roles of intrinsic 
and extrinsic noise in cell cycle variability, Kar et al." performed 
this conversion on a simple deterministic model of the budding 
yeast cell cycle. Using Gillespie SSA, they showed that intrinsic 
noise dominates over extrinsic noise in determining the variabil- 
ity of cell size and cell cycle timing in budding yeast. However, 
to obtain the correct level of noise, they had to use mRNA abun- 
dances 4 times greater than the values reported on the basis of 
microarray studies." Also, they had to use mRNA half-lives 
(0.2-5 min) considerably shorter than the values reported on the 
basis of microarray studies. 11 

A year later, Barik et al. 10 presented a more realistic model of 
budding yeast growth and division, based on the idea that mul- 
tisite phosphorylation of key cell cycle regulators is the primary 
source of nonlinearity in the control system. Even though the 
overall effect of multiple phosphorylations on protein behavior 
is nonlinear (i.e., switch-like), each phosphorylation event can 
be described by a mass-action rate law. Hence, the model can be 



simulated deterministically, by solving the nonlinear ODEs, or 
stochastically, by applying Gillespie SSA. The deterministic ver- 
sion of the model accurately reflected GJS size control and tim- 
ing. The stochastic version accurately reproduced variability in 
cell cycle progression, provided that mRNA half-lives (1—5 min) 
are shorter than expected, 37 and mRNA levels (5-10 molecules 
per cell) are larger than expected. 38 The mRNA abundances in 
Barik's model 10 are in line with the values reported by Zenklusen 
and Singer 39 for non-cell cycle-related genes, based on FISH 
measurements. Because FISH data on cell cycle-related genes 
was lacking, it was impossible for Barik et al. 10 to calibrate their 
model accurately to the expected number and variability of regu- 
latory mRNA molecules per cell. 

In pursuit of more comprehensive and explanatory stochastic 
models of the cell cycle, we have used single mRNA FISH to 
measure the statistical distributions of transcripts for 16 cell cycle 
regulators in budding yeast. 3 ' We found that cell cycle regulators 
known to be constitutively expressed showed transcript abun- 
dance patterns consistent with a Poisson distribution. mRNA 
distributions of periodically expressed genes were best modeled 
as two-component Poisson distributions, reflecting their regu- 
lated high and low states of transcription. 40 We were also able 
to deduce the length of time in the cell cycle during which gene 
expression is elevated and the amplitude of the oscillations in 
gene expression. 

We then used this data to refine the multisite phosphoryla- 
tion model of Barik et al. 10 To match our observed distributions, 
we modified the model to allow complete dissociation of Whi5 
from SBF and of Netl from Cdcl4, and we adjusted transcrip- 
tion and translation rates to produce NET1 and WHI5 mRNA 
distributions that better matched our observations. Because the 
only periodically expressed transcript in this model refers to the 
Gj/S cyclins (Clnl, Cln2, Clb5, and Clb6, called "ClbS" in the 
model), we were able to reproduce the two-component Poisson 
distributions of CLN1 and CLN2 mRNAs in our data set, but 
not of other periodic transcripts. 

Results 

Statistical distributions of mRNA abundance 

Individual transcripts from 16 cell cycle genes were counted 
in single cells using single molecule FISH. 39 To this end, we 
used commercially available strains of Saccharomyces cerevisiae 
for which a GFP-coding sequence was inserted as an in-frame 
C-terminal fusion to a cell cycle control gene. A mixture of 5 
fluorescently labeled oligonucleotides, each containing 5 internal 
fluorescent dye molecules, was hybridized to the GFP portion 
of these fused transcripts (Fig. 1A). Spots were visualized and 
counted with a spot-detection algorithm (Fig. IB and C). 41 Cells 
lacking any GFP fusion elements were used as a negative control 
and showed virtually no fluorescent spots (Table 1), indicating a 
high specificity of the probes to the GFP transcripts. As a positive 
control, MDN1 transcripts were counted using probes against 
the coding region of MDN1 in a MDN1—GFP strain, and again 
using the same GFP probe set used for the other strains. Data 
from these 2 probe sets exhibited similar mRNA distributions, 
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and MDN1 mRNA abundances were consistent with previous 
reports (Table l). 42 

We then performed single mRNA FISH on asynchronous 
cultures of the 16 strains with GFP fused to a particular gene 
involved in cell cycle regulation. The average numbers of tran- 
scripts for each gene from one biological replicate are reported 
in Table 1. Similar results were obtained in 2 other biological 
replicates (Table SI). Our results, as well as those from other 
single mRNA FISH studies, 42 ' 43 consistently showed mRNA 
levels roughly 4-fold higher than large-scale transcriptome stud- 
ies (Table S2). This discrepancy is probably due to the method 
of normalization used in transcriptome studies to infer average 
transcript abundance per cell (for discussion of this issue, see 
Supplementary Material) . In fact, all transcriptome studies are 
in good agreement with our mRNA FISH data when they are 
normalized to 60000 transcripts/cell (Table S2). 



Test of transcriptional regulation 

The genes studied here are involved in the M/Gj and GJS tran- 
sitions. Half of them are known to be constitutively expressed, and 
the other half are transcriptionally regulated (Table 1; Fig. SI). 
Furthermore, the periodic genes represent most of the known 
expression mechanisms for cell cycle genes, with the expression of 
at least one gene peaking at every stage of the cell cycle (Fig. SI). 

We used maximum likelihood estimation to fit the mRNA 
data to a Poisson distribution. We found that BCK2, BUB2, 
CDC15, LTE1, MAD2, MCM1, and NET1 mRNAs all fit well to 
this distribution, confirming that these genes are constitutively 
transcribed (Fig. ID; Table 1; Table SI; Fig. S2). 44 ' 48 

Transcript data for 6 of the genes (CDC6, CDC20, CLB2, 
CLN1, CLN2, and SIC1) did not fit a Poisson distribution 
(Table 1; Table SI). Each of these distributions has a clearly visi- 
ble shoulder or tail in the histogram (Fig. ID; Fig. S2), consistent 
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Figure 1. Summary of single mRNA FISH method and mRNA distributions. (A) Schematic of how the FISH probes hybridize to target mRNAs. (B) Example 
image showing individual mRNA molecules. Image is a maximum intensity projection of a z-series with merged phase-contrast and fluorescence chan- 
nels. (C) Detected single transcripts. The image in the left panel was processed as described in "Materials and Methods" to detect spots of single mRNAs. 
Scale bar = 5 ^m. (D) Experimental distributions of mRNA for 16 cell cycle genes. Gray bars show the experimental histograms; red lines are the fit to a 
single component Poisson, and blue lines are the fit to a two-component Poisson. Data are from the biological replicate shown in Table 1. Similar results 
were obtained in 2 other biological replicates (Table SI). 
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with the hypothesis that these genes switch between 2 states of 
high and low transcription. Not surprisingly, all of these genes are 
annotated as periodically expressed (Table 1). We found that the 
distributions of mRNA abundance of these 6 genes are described 
well by two-component Poisson distributions, confirming that 



they have 2 expression states in the population (Fig. 1; Table 1; 
Table SI; Fig. S2). 

Similar results were obtained in 3 biological replicates 
(Table 1 ; Table SI) . There was generally good agreement between 
previous gene annotations and the outcomes of statistical tests to 



Table 1. Cell cycle mRNA distributions in asynchronous populations and derived gene expression parameters 



Gene 


Periodic 3 


FISH mean 
± 95% Cl b 


Single Poisson c 


Two-component Poisson d 








G test P 
value 


effect 
size 


P ± 95% CI 


X, ± 95% CI 


X, ± 95% CI 


G test P 
value 


<j) 2 effect size 


BY4733 (WTno GFP) 


No 


0.08 ± 0.02 
















MDN1 (MDN1 
probes) 


No 


9.37 ± 0.23 


0.02 


0.04 












MDN1 (GFP probes) 


No 


9.20 ± 0.27 


4.47E-05 


0.01 












BCK2 


No 


4.16 ±0.21 


0.78 


0.01 












BUB2 


No 


3.04 ±0.1 3 


9.87E-04 


0.04 












CDC 15 


No 


3.24 ±0.1 3 


0.18 


0.01 












LTE1 


No 


2.73 ± 0.09 


0.00 


0.10 












MAD2 


No 


2.50 ±0.10 


0.69 


4.22E- 
03 












MCM1 


No 


6.06 ± 0.22 


4.10E-03 


0.05 












NET1 


No 


6.20 ± 0.25 


0.60 


0.02 












CDC6 


Yes 


4.07 ± 0.20 


0.00 


0.99 


0.67 ± 0.07 


2.1 2 ±0.26 


8.08 ± 0.72 


0.03 


0.05 


CDC20 


Yes 


4.16 ±0.15 


0.00 


0.64 


0.63 ± 0.08 


2.47 ± 0.29 


7.08 ±0.61 


0.09 


0.02 


CLB2 


Yes 


3.01 ±0.10 


0.00 


0.74 


0.65 ± 0.07 


1 .55 ± 0.20 


5.69 ± 0.49 


1.20E-10 


0.07 


CLN1 


Yes 


3.03 ± 0.23 


9.99E-16 


0.47 


0.85 ± 0.07 


2.03 ± 0.28 


8.56 ± 1.53 


2.02E-03 


0.09 


CLN2 


Yes 


4.45 ±0.1 3 


0.00 


4.49 


0.77 ± 0.03 


2.09 ± 0.1 2 


12.4 ±0.55 


5.92E-06 


0.06 


ESP1 


Yes 


2.89 ±0.11 


7.23E-07 


0.05 


0.62 ± 0.26 


2.1 5 ±0.42 


4.09 ± 0.75 


0.11 


0.01 


SIC1 


Yes 


3.44 ±0.1 2 


0.00 


0.62 


0.84 ± 0.03 


2.47 ±0.1 5 


8.68 ± 0.79 


4.46E-03 


0.03 


TEM1 


Yes 


2.83 ±0.1 2 


0.10 


0.02 


0.53 ± 0.68 


2.1 7 ±0.89 


3.57 ± 1.09 


0.75 


3.19E-03 


WHI5 


Yes 


3.45 ±0.1 3 


1.11E-16 


0.15 


0.52 ±0.1 6 


2.1 2 ±0.41 


4.90 ± 0.57 


0.99 


9.89E-04 



Expression of genes are ranked as cell cycle periodic by Cyclebase 46 and by ref. 59; Transcripts/cell. The biological replicate from Figure 1 is shown. 
Similar results were obtained in other replicates (Table S1); 'Measured mRNA distribution fits to a Poisson distribution. G test of goodness of fit to a single 
Poisson distribution with X = FISH Mean. 4> 2 effect size. By convention, effect sizes of <S> 2 < 0.09 are considered small (green), 0.09 < 4> 2 < 0.25 are considered 
medium (orange), and 4> 2 > 0.25 are considered large (red). 63 Null hypothesis for the P value is that the empirical data have a single Poisson distribution 
with mean X; d Measured mRNA distribution fits to a two-component Poisson distribution. Same data set as for the single Poisson distribution. p\ fraction 
of the population with low transcript levels. X ]: mean transcripts/cell for the fraction of the population with low transcript levels. X 2 , mean transcripts/cell 
for the fraction of the population with high transcript levels. G test P values and (j> 2 as for single-Poisson fit, but the null hypothesis is that the data have a 
two-component Poisson distribution with means X 1 and X 2 . 
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differentiate constitutively expressed and transcriptionally regu- 
lated genes. In fact, there was an excellent correlation between 
whether a gene showed mRNA periodicity and whether the cor- 
responding protein oscillates (Table S3). 33 In this context, it is 
notable that although ESP1, TEM1, and WHI5 are ranked as 
periodic, their transcripts fit a single Poisson distribution in 2 of 
the 3 biological replicates, and neither Espl-GFP nor Teml-GFP 
display significant protein oscillations (Table S3). 33 
Timing and amplitude of mRNA oscillations 
We next asked if the proportion of highly expressing cells 
in the asynchronous population correlated with the proportion 
of the cell cycle in which the gene is expressed. We compared 
our FISH experiments to mean expression profiles in Cyclebase, 
derived from 6 different microarray experiments using synchro- 
nized cells (Fig. SI). For each periodic gene, we compared the 
fraction of cells with high mRNA abundance (Table 1) to the 
fraction of the cell cycle time in which gene expression is >50% 
of the peak level (full-width half-maximum; Fig. 2A— C). The 
good correlation between FISH and microarray data (Fig. 2C) 



indicates that the proportion of highly expressing cells does 
indeed correspond to the proportion of cell cycle time during 
which the gene is expressed. 

The two-component Poisson analyses produce mean transcript 
numbers for the highly expressing population (\ 2 ). It seemed 
reasonable to expect that the ratio between the mean transcript 
numbers for the highly expressing subpopulation and for the 
population as a whole would correlate with the peak expression 
level of the gene divided by its mean expression level in microar- 
ray experiments. In fact, comparison of the 2 data sets shows a 
strong correlation (Fig. 2D). Therefore, distribution analysis of 
single mRNA FISH yielded accurate values for the magnitude of 
change in gene expression for regulated genes. 

The amplitudes of CLB2 and SIC1 expression did not match 
in FISH and microarray data (the amplitudes are much higher in 
Cyclebase) . Consequently, these outliers were excluded from the 
correlation analysis reported in Figure 2D. Nevertheless, when 
these transcripts were included, the correlation of expression 
between FISH and microarray data was still good {R 2 = 0.38, 
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Figure 2. Correlation between the timing and magnitude of gene expression from FISH experiments and Cyclebase microarray experiments. (A) CLB2 
microarray data plotted on a linear scale with the minimum value set at 0 and the maximum set at 1. Used as an example of how we measure the length 
of time that the transcript level is greater than the half-maximum level (FWHM, full-width at half maximum). (B) CLB2 microarray data on a log 2 scale. 
Values are relative to the time-course mean. Used as an example of how we measure the peakimean value and the total cell cycle time. (C) Correlation 
plot for each cell cycle gene showing fraction of the high expression population from FISH experiments (1 - (3) compared with the fraction of cell cycle 
time in which transcript levels exceed FWHM level from Cyclebase. (D) Correlation plot for each periodic cell cycle gene showing the ratio of the mean 
transcript values of the high abundance population (\ 2 ) to the mean for the population from FISH experiments compared with the peakimean values 
from Cyclebase. Data are from the biological replicate shown in Table 1. Similar results are obtained with all biological replicates. Values for CLB2 and 
SIC1 are not included (see text). Some genes have two peaks of expression in the Cyclebase data, producing two data points. R 2 values and P values 
from ANOVA statistical analysis of the linear regressions are shown. Null hypothesis for the P value is that there is no correlation between the FISH and 
Cyclebase data. 
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P = 0.018). The mismatch for CLB2 and SIC1 may be attrib- 
utable to the methods used to synchronize cells and to register 
the data in Cyclebase. In the microarray experiments, the abun- 
dance of CLB2 and SIC1 transcripts is suppressed in the first 
portion of the expression plots (Fig. SI). For CLB2, expression 
levels are much lower at the beginning of the time-trace than 
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Figure 3. CLN2 mRNA distribution changes during cell cycle progression. Single mRNA FISH 
was performed on a-factor synchronized CLN2-GFP cells. These cells were chosen, because 
CLN2 expression is induced at the START transition and should be low in a-factor arrested cells 
(G,). Cells were released from arrest (0 min) and allowed to proceed through the cell cycle. 
Samples were removed from the culture at the indicated times and processed for FISH, FACS 
analysis of DNA content, and to determine bud morphology as benchmarks for cell cycle pro- 
gression. Left panel: CLN2 mRNA cellular abundance distributions as determined by FISH. Also 
shown are single Poisson distributions (red line) and two-component Poisson distributions 
(blue line) fitted to the mean mRNA abundances. The Poisson parameters and statistics are 
provided in Table 2. Middle panel: DNA content as determined by FACS analysis with the pro- 
portions of 1N and 2N cells indicated as percentages. Percentages are not 100% because of 
incompletely stained cells that are not within the peaks. Right panel: the proportion of unbud- 
ded (G,) cells, small-budded (<1/3 size of mother; S-phase) cells, large-budded cells with one 
nucleus (G 2 /M), and large-budded cells with an elongated nucleus in the neck or in which both 
mother and bud contain a nucleus (anaphase/telophase). 



the minimum attained in the subsequent cell cycle, despite the 
fact that CLB2 levels should be high but decreasing in M phase. 
Similarly, SIC1 levels are artificially low at the start (the initial 
M phase), but this is clearly an artifact of how Cyclebase reg- 
isters the experiments using a factor or a temperature-sensitive 
cdc28 allele (both of which arrest cells in Gj). By contrast, experi- 
ments using a temperature-sensitive cdcl5 
allele, which arrests the cells in late M, did 
show an initial peak (Fig. SI). The lack of a 
starting peak for Gj-arrested cells is likely due 
to the fact that the "initial M phase" of the 
time-registered experiments actually occurs at 
the end of the second cell cycle, by which time 
synchrony is reduced. The underestimated 
expression levels for the first part of the time- 
course data for CLB2 and SIC1 result in lower 
overall means and inflated relative peak values 
for these genes. The synchronization method 
is not a problem for the other periodic genes, as 
they are expressed earlier in the cell cycle than 
CLB2 and SIC1, before synchrony degenerates, 
and show 2 clear peaks in mRNA abundance, 
which produce more accurate overall means 
(Fig. SI). So, it appears possible that Cyclebase 
overestimates the magnitude of expression 
(peak: mean ratio) for CLB2 and SIC1. 

Another possible explanation of the dis- 
crepancy is that our FISH data underestimated 
these oscillations due to saturation of the spot 
detection algorithm. We do not think this 
was the case, because 3 genes (CLN1, CLN2, 
CDC20) had higher absolute numbers of tran- 
scripts per cell than CLB2 and SIC1, and these 
genes correlated well with the Cyclebase data 
(Table 1; Fig. 2D). 

To demonstrate that the low and high 
expressing populations in the FISH results cor- 
respond to cell cycle periodicity of expression, 
we performed a time course experiment using 
synchronized CLN2-GFP cells and moni- 
tored CLN2 mRNA levels relative to cell cycle 
progression, as determined by bud morphol- 
ogy and DNA content. As expected, CLN2 
mRNA abundance was low in cells arrested 
in Gj by treatment with a-factor, and the sin- 
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gle-cell mRNA levels fit a single-component 
Poisson distribution reasonably well (Fig. 3; 
Table 2). Within 30 min after release from G l 
arrest, the proportion of cells expressing high 
levels of CLN2 mRNA increases, the mean 
transcript abundance of the high expressing 
population increases, and the mRNA abun- 
dances now fit two-component Poisson dis- 
tributions (Fig. 3; Table 2). The shift in the 
cell population to higher CLN2 mRNA abun- 
dance corresponded to S-phase entry as shown 
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by a corresponding shift of DNA content from IN to 2N and by 
the appearance of small-budded cells (Fig. 3). A small population 
of high expressing cells persisted at later times, probably due to 
imperfect synchronization, since there were G,- and S-phase cells 
throughout the experimental time course (Fig. 3). 

A second peak in CLN2 expression occurred at the 80 min 
time point as the cells entered a second cell cycle (Fig. 3; Table 2). 
In fact, CLN2 expression appeared to be more robust in the sub- 
sequent cycle despite a clear degradation in synchrony compared 
with the 30 min time point, suggesting that a-factor treatment 
might initially suppress the accumulation of CLN2 mRNA after 
cells are released. 

Surprisingly, the proportion of cells that have no CLN2 
mRNA was a minor portion of the population throughout the 
time course, and the mean CLN2 mRNA levels of the low express- 
ing population remained relatively constant (Fig. 3; Table 2). 
The persistence of low expressing cells might be due to imper- 
fect synchronization of the population, since there was a signifi- 
cant population of G-S cells even 60 min after release (Fig. 3). 
However, the proportion of low expressing cells was much higher 
than the fraction of G—S cells (e.g., 75% low expressing cells but 
only -35% Gj-S cells at 60 min; Fig. 3), indicating that there is 
measurable CLN2 basal transcription throughout the cell cycle. 

Comparing mRNA and protein dynamics 

Although we are examining only a small set of cell cycle-reg- 
ulatory genes in this study, our results suggest that regulation 
of mRNA abundance is an excellent predictor for the regulation 
of protein abundance (Table S3). However, as revealed in mul- 
tiple transcriptome-proteome studies, mRNA and protein steady- 
state levels are usually not well correlated (R 2 values are typically 
between 0.2-0.5) . 49 To better understand the influence of mRNA 
dynamics on protein dynamics, we decided to test if there were 
any correlations between the amplitudes of mRNA and protein 
oscillations and between mRNA and protein abundances. 



Unfortunately, we were unable to compare mRNA and pro- 
tein abundances in the same cells, because GFP fluorescence 
is not well preserved using the single mRNA FISH protocol. 
However, we previously described protein dynamics for the same 
16 GFP-tagged strains, 33 and we were able to compare the mRNA 
data from FISH to these protein data. We found that there was 
no significant correlation between the amplitudes of mRNA and 
protein oscillations (Fig. 4A, P = 0.2, null hypothesis is zero cor- 
relation). This poor correlation of mRNA to protein amplitudes 
probably reflects individual rate differences between synthesis 
and degradation of mRNA compared with protein. Furthermore, 
when we compared mean population levels of mRNA to pro- 
tein (values are listed in Table S3), there was a mildly significant 
correlation reminiscent of transcriptome-proteome comparisons 
(Fig. 4B, black regression line, P = 0.03). 49 However, we found 
very strong correlations between mRNA and protein abundances 
when we grouped the data by constitutive or periodic mRNA 
(Fig. 4B, blue regression line P = 1.8E-4 and red regression line 
P = 1.4E-3, respectively). We were able to use the protein popu- 
lation means for these correlations, because all of the proteins 
exhibit normal or Poisson distributions and do not consist of 
widely distinct populations expressing low or high levels of pro- 
tein as there is for mRNA (Fig. S3). 

Although separating the genes by their regulation showed 
good mRNA: protein correlations, the slope of the linear regres- 
sion for the periodic genes was much higher than that of the 
constitutive genes (Fig. 4B, red regression line m = 354.4 and 
blue regression line m = 57.2, respectively). There is no reason 
to suppose that translation efficiency is so exquisitely sensitive 
to mRNA concentration differences of a few molecules per cell, 
since translation rates are largely governed by initiation and elon- 
gation kinetics. 50,51 Instead, we postulated that the high slope of 
the correlation for periodic genes was due to the fact that most 
protein is produced in the high transcription state, and using 



Table 2. Cell cycle-dependent CLN2 mRNA distributions in synchronized cells and derived gene expression parameters 



Cell cycle stage of CLN2 cells 


FISH mean 
± 95% Cl a 


Single Poisson b 


Two-component Poisson' 


G test P 
value 


4> 2 effect 
size 


P ± 95% CI 


\ ± 95% CI 


k 2 ± 95% CI 


G test P 
value 


<|> 2 effect 
size 


asynchronous from Table 1 


4.45 ±0.1 3 


0.00 


0.77 ± 0.03 


2.09 ±0.1 2 


12.4 ±0.55 


5.92E-06 


0.06 


asynchronous 


5.20 ±0.11 


0.00 


18.08 


0.70 ± 0.02 


1.29 ±0.08 


14.46 ±0.41 


0 0.00 


0.45 


0 min a-factor release 


2.94 ±0.11 


0.00 


0.15 


0.78 ±0.1 4 


2.25 ± 0.30 


5.46 ± 1 .09 


0.03 


0.02 


15 min a-factor release 


2.79 ±0.11 


0.00 


2.78 


0.83 ± 0.03 


1.1 9 ±0.09 


10.86 ±0.62 


1 .64E-07 


0.08 


30 min a-factor release 


8.42 ±0.19 


0.00 


23.77 


0.64 ± 0.03 


2.33 ±0.1 5 


19.10 ±0.54 


0.00 


0.76 


40 min a-factor release 


6.61 ±0.16 


0.00 


12.23 


0.74 ± 0.03 


2.54 ±0.14 


18.13 ±0.66 


0.00 


0.57 


60 min a-factor release 


6.61 ±0.19 


0.00 


11.51 


0.75 ± 0.03 


2.64 ±0.1 6 


18.66 ±0.77 


0.00 


0.54 


80 min a-factor release 


12.59 ±0.23 


0.00 


15.09 


0.46 ± 0.03 


3.14 ±0.23 


20.68 ± 0.47 


0.00 


0.92 



a Transcripts/cell. Data are from cells shown in Figure 3; b Measured mRNA distribution fits to a Poisson distribution. G test of goodness of fit to a single 
Poisson distribution with \ = FISH Mean. <|> 2 effect size. By convention, effect sizes of <j> 2 < 0.09 are considered small (green), 0.09 < <j> 2 < 0.25 are considered 
medium (orange), and 4> 2 > 0.25 are considered large (red). 63 Null hypothesis for the P value is that the empirical data have a single Poisson distribution 
with mean \; 'Measured mRNA distribution fits to a two-component Poisson distribution. Same data set as for the single Poisson distribution. [3, fraction 
of the population with low transcript levels. \ ]7 mean transcripts/cell for the fraction of the population with low transcript levels. k 2 , mean transcripts/cell 
for the fraction of the population with high transcript levels. G test P values and <|> 2 as for single-Poisson fit, but the null hypothesis is that the data have a 
two-component Poisson distribution with means X, and k r 
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lower population mRNA means overestimates the number of 
proteins/transcript. Therefore, we compared the protein abun- 
dance to the mRNA means of the high expressing population 
for the periodic genes (A., from Table S3). This results in a linear 
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Figure 4. Correlations of mRNA oscillation amplitudes and abundances 
to protein. Data for all correlation plots is from Table S3. All protein data 
are derived from Ball et al.33 (A) Correlation plot for each periodic cell 
cycle gene showing mRNA amplitudes from FISH experiments (X - X,) 
compared with protein oscillation amplitudes. Only genes that showed 
both mRNA and protein oscillations are included. Black line is the linear 
regression. (B) Correlation plot of all 1 6 cell cycle genes, showing the ratio 
of the mean transcript abundance from FISH experiments to the mean 
protein abundance, as measured by fluorescence (mean pixel intensity/ 
cell). The linear regression and statistics for the entire set (black), for the 
constitutive genes only (blue) and for the periodic genes only (red) are 
shown. (C) Correlation plot of all 1 6 cell cycle genes. Same as (B), except 
that instead of the mean mRNA abundance for the population, the mean 
of the high abundance population (X 2 ) is used for the periodic genes, ft 2 
values and P values from ANOVA statistical analysis of the linear regres- 
sions are shown. Null hypothesis for the P value is that there is no correla- 
tion between the FISH and protein data. 



regression with the same slope as that for the constitutive genes 
and slightly fewer proteins/mRNA in the group of periodic genes 
compared with the constitutive genes (Fig. 4C) 

All of the oscillating proteins in this study are known to be 
degraded via regulated ubiquitination and the 26S proteasome. 52 
Moreover, recent genomic, transcriptomic, and proteomic analy- 
ses have found that oscillating cell cycle proteins use sub-optimal 
codons to regulate translation in conjunction with cell cycle 
changes in tRNA availability and translation initiation rates. 50,51 
Therefore, the lower protein-to-mRNA ratio in periodic genes 
could be because the oscillating proteins have less efficient trans- 
lation, enhanced degradation rates, or a combination of both. 

These correlations also address the potential for the GFP tag 
to alter mRNA or protein dynamics. The good agreements of 
mRNA with protein levels suggest that GFP does not generally 
affect mRNA or protein synthesis or stability in this set of genes. 

Including mRNA in a stochastic model of the cell cycle 

We have previously described a multisite phosphorylation 
model of the budding yeast cell cycle, 11 which accurately reca- 
pitulates experimental observations regarding natural varia- 
tions in cell cycle timing and cell volume (see Supplementary 
Material for a brief description of the model). The model made 
certain assumptions about the mean levels of mRNA abundances 
of cell cycle regulators, because at the time there were no data 
on mRNA abundances in single cells for these genes. In light of 
our measurements of these mRNA distributions, we decided to 
revisit the model and assess its predicted mRNA distributions. 

First of all, we compared our observed distribution of CLN2 
mRNA (Fig. ID) to our stochastic simulations of MbS (mRNA 
for GJS cyclin CLBS) in our original model (Fig. 9B in ref. 10). 
Our simulations predicted a much larger proportion of cells with 
high levels of CLN2 mRNA. To bring the model in line with our 
observations, we reasoned that the production of CLN2 mRNA 
has to be more abrupt, producing a transient peak in abundance 
to match the observed distribution of CLN2 mRNA. In order to 
produce a more abrupt burst of CLN2 mRNA, we allowed phos- 
phorylation of terminal complex (CmpP 2 = complex of Whi5P 2 
and SBF) by Cln3 and ClbS-dependent kinase, so that Whi5P 2 is 
phosphorylated to Whi5P 3 , and SBF is released completely from 
the complex. We also modified the model to allow for phos- 
phorylation of the RENT 5 complex, leading to complete release 
of Cdcl4. These simple changes resulted in steeper and more 
realistic peaks in SBF activity, Cdcl4 release, and CLN2 mRNA 
(MbS) abundance (compare Fig. 6 to Figs. 3 and 5 in ref. 10). 

In addition, the distributions of 2 constitutive mRNAs in our 
model, WHI5 and NET1 transcripts, were shifted slightly to the 
right compared with our experimental observations. Therefore, 
to reproduce the observed distributions of WHI5 and NET1 tran- 
scripts, we reduced the mRNA production rates for these 2 genes 
and increased protein synthesis rates, such that total protein lev- 
els remained the same. 

With these small adjustments, the model now captures quanti- 
tatively the observed distributions (Fig. 5) and quantities (Table 3) 
of CLN2, WHI5, and NET1 transcripts and the time-course of 
expression of the CLN2 gene (compare Fig. 6C to Fig. SI). All 
adjustments to the model are listed in Tables S3 and S4. 
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In addition to fitting the FISH data, simulations of the model 
with the new parameters corresponded well with the statistics of 
cell populations collected by Di Talia et al. 14 Examining the tim- 
ing of Gj phase, we see a size-dependent component (slope -0.55) 
for small birth size daughters and a size-independent component 
(slope -0.22) for larger daughters (Fig. S4); the corresponding 
values in Di Talia et al. 14 are -0.7 and -0.3. In our simulations, 
the smallest quartile of mother cells show a weak, size-dependent 
component to the time spent in G t (slope -0.18), but most moth- 
ers show no size dependence (slope -0.04; Fig. S4). Although Di 
Talia et al. 14 obtained a single straight line with slope -0.1, the 
small amount of size control in mothers we observe in our new 
model is a marked improvement over our previous result (slope 
-0.36). 10,14 Stochastic simulations with our new model produce 
cell cycle timing and cell volume distributions that correspond 
well with published observations (Table 4). Notice that the great- 
est variations occur in the timing of G,, when the number of 
molecules per cell is lower than in later phases of the cell cycle. 

Having incorporated our mRNA observations in the model 
simulations, we wanted to determine how using the mRNA 
FISH data affected the outcome of stochastic simulations in 
comparison to using mRNA abundances reported by microarray 
studies. We ran simulations and collected statistics for popula- 
tions in which each cell on average had 4-fold fewer numbers 
of mRNAs for all the genes included in the model. Although 
the average properties of cell cycle metrics remain more-or-less 
the same, the variability of these metrics increases significantly 
(Fig. 7). These results highlight the significance of mRNA levels 
in determining the amplitude of intrinsic noise and the impor- 
tance of obtaining absolute mRNA quantities for stochastic mod- 
eling of gene expression. 

Discussion 

Recent attempts to build realistic, accurate, stochastic models of 
the CDK control system in budding yeast have been hampered by 
the lack of reliable data on the distribution of transcript numbers 
for cell cycle control genes in individual yeast cells. Computational 
models indicated that variability in cell cycle progression is 
extremely sensitive to the average number of gene transcripts per 
cell and to the expected half-lives of these mRNA molecules. Data 
available at the time, from yeast transcriptome studies, suggested 
that the average number of transcripts per cell was -1 3S and that 
mRNA half-lives were -20 min, 37,38,53 and more recent studies have 
supported these values. 54,55 Using these numbers, one can easily 
estimate that fluctuations in protein abundances per cell would be 
expected to be ± 50-100%, which is likely inconsistent with reli- 
able behavior of any molecular control system." 

Later measurements of mRNA transcripts for a handful of 
non-cell cycle control genes in single yeast cells by fluorescence 
in situ hybridization (FISH) suggested that the whole-tran- 
scriptome measurements might underestimate average mRNA 
abundances by -4-fold, 42 and a subsequent microarray study 
normalized according to this observation gave quantities of tran- 
scripts per cell in line with the mRNA FISH measurements. 56 
Furthermore, 20 min half-lives for mRNAs involved in cell cycle 



control seem to be inconsistent with how fast gene expression is 
turned on and off during normal cell cycle progression and in 
response to regulated promoters. In general, transcripts encoding 
regulatory proteins have shorter mean half-lives than transcripts 
coding for central metabolic proteins. 37 More specifically, recent 
transcriptome analyses indicate that the transcripts for cell cycle 
genes compose a large proportion of the least stable transcripts. 54 
In fact, the half-lives for the 16 cell cycle genes included in the 
current study fall in the range of 5-10 min when measured by 
Dynamic Transcriptome Analysis (Table SI). 54,56 

To help resolve uncertainties in mRNA quantities, we set out 
to measure the absolute quantities and distributions of mRNA 
transcripts of 16 cell cycle control genes in single yeast cells by 
FISH. Our data confirmed that the whole-transcriptome esti- 
mates are too low by -4-fold. In addition, we were able to extract 
information about the dynamics of gene expression from a "snap- 
shot" of the mRNA distribution in an asynchronous culture of 
yeast cells. Constitutive genes fit a single Poisson distribution, 
while periodic genes generally fit a two-component Poisson dis- 
tribution. The low P values for some constitutive genes using the 
Poisson model (e.g., MCM1) and also some periodic genes for the 
two-component Poisson model (e.g., CLN1) suggest that these 
distributions may not fit the observed data exactly. However, the 
effect sizes (4> 2 in Table 1; Table SI), Q-Q plots (Fig. S2), and 
density overlays (Fig. 1) suggest that the one- and two-compo- 
nent Poisson models generally fit the constitutive and periodic 
data well. Nevertheless, our observations that the mRNA distri- 
butions for some genes do not exactly fit single or two-compo- 
nent Poisson distributions might be a consequence of using an 
oversimplified two-state model for transcription. 

Comparisons between mRNA and protein also yielded some 
interesting observations. We found that transcript periodicity 
corresponds to protein oscillations, and that constitutively tran- 
scribed genes usually produce constitutive protein levels. These 
might appear to be obvious correlations, but it is theoretically 
possible that a constitutively transcribed gene could produce 
protein oscillations via post-transcriptional mechanisms, or that 
these same mechanisms could also smooth out mRNA periodic- 
ity to yield constitutive protein levels. These 2 situations could 
arise in response to rapid developmental/environmental changes 
or as evolutionary intermediates as a gene is utilized for a new 
function, but we are not likely to see discordant mRNA and pro- 
tein regulation during exponential growth in such ancient regula- 
tory networks as those controlling the cell cycle. 



Table 3. mRNA quantities from stochastic simulations of the model com- 
pared with observed quantities 



Gene 


Mean 3 






Model 


Expt 


Model 


Expt 


CLBS 


6.2 


4.5 


1.19 


1.14 


NET1 


6.8 


6.2 


0.21 


0.43 


WHI5 


4.2 


3.5 


0.49 


0.67 



"Data from the biological replicates shown in Table 1. Similar results 
are obtained with all replicates; b Coefficients of variation (CV = standard 
deviation/mean). 
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Table 4. Population statistics of stochastic simulations compared with experimental values 



Quantity 3 


Daughter cell 


Mother cell 




Mean (model) 


Mean (expt) 


CV b (model) 


CV (Expt) 


Mean (model) 


Mean (expt) 


CV (model) 


CV (expt) 


In 

div 


118.0 


112.0 


0.18 


0.22 


91.3 


87.0 


0.20 


0.14 




40.6 


37.0 


0.43 


0.50 


13.5 


16.0 


0.47 


0.50 


"^SG2M 


77.5 


76.0 


0.21 


0.20 


77.8 


72.0 


0.23 


0.17 


v 

birth 


31.3 


28.0 


0.29 


0.22 


46.9 


40.0 


0.29 


0.18 


div 


70.9 




0.25 




89.1 




0.30 





a T,. , time from birth to division; J r „ time from birth to bud formation; T cr „„ time from bud formation to division; V. . „,., dauqhter cell volume at birth; V 

div* ' Gr ' SG2M ' birth' 3 ' c 

mother cell volume at division; b Coefficients of variation (CV = standard deviation/mean). 



Although there was no significant correlation between the 
amplitudes of mRNA and protein oscillations, mRNA abun- 
dances agreed very well with protein abundances. For constitu- 
tively expressed genes and periodic genes, the mean abundances 
of mRNA and protein correlated well, but only when grouped 
separately. On the other hand, using the transcript abundance 
of the high-expressing population (\ 2 ) produced a correlation 
for periodic genes that fit well with the constitutive genes. These 
observations demonstrate one of the caveats of transcriptome- 
proteome correlations — using population means produces a poor 
mRNA-protein correlation when combining constitutive and 
regulated genes in the analysis, which is why various method- 
ologies, such as principal component analysis, must be used to 
identify groups of differently regulated genes. 

Three genes (ESP1, TEM1, and WHI5) that Cyclebase labels 
as periodic exhibit mRNA distributions that are marginally fit by 
a single Poisson distribution. Examining their expression profiles 



in Cyclebase (see Fig. SI), we find that the data are quite noisy, 
that the changes in transcript abundance are quite small (-2- 
fold ratio between \j and X 2 ), and that the majority of cells are 
grouped in the higher abundance population ((3 < 0.5), reflecting 
their broad peaks of expression (Table 1; Table SI). Furthermore, 
in a previous study, we similarly found the Espl and Teml pro- 
teins do not show significant oscillations. 33 Although these 
genes may indeed be weakly periodic, modeling their expression 
as constitutive produced an excellent fit of our simulations to 
the available experimental data. In addition, when performing 
mRNA-protein comparisons, it made no significant difference if 
we grouped ESP1, TEM1, and WHI5 with constitutive genes or 
with periodic genes (data not shown). 

Using these measurements of mRNA distributions of cell cycle 
control genes, we were able to significantly improve our published 
model of the budding yeast cell cycle. 10 The observed distributions 
of CLN1 and CLN2 transcripts suggested a major change in how 
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Figure 5. mRNA distributions from stochastic simulations of the model compared with observed distributions. (A-C) Comparison of experimentally 
observed mRNA abundance to simulated distributions for (A) CLN2, (B) NET1, and (C) WHI5. Experimental data are from biological replicates shown in 
Table 1. Similar results are obtained with all biological replicates. 
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we modeled the phosphorylation of Whi5:SBF complexes and of 
Netl:Cdcl4 complexes. The observed distributions of other genes 
prompted minor changes in the rate constant values for gene tran- 
scription and translation. After these changes were made, not only 
was our model brought into alignment with our measured gene 
expression profiles, but also it was more successful in fitting the 
observed distributions of other cell cycle metrics. Although the 
mRNA half-lives used in this model (1—5 min) are shorter than 
measured values (5-10 min), 54,56 the discrepancy might easily be 
accounted for by some degree of mRNA gestation and senescence, 
as explained in the supplement to Barik et al. 10 A new version of 
the model (under development), will include periodic expression 
of CLB2, SIC1, CDC6, and CDC20, using the data collected 
here to determine the correct mRNA distributions in the model. 
Furthermore, mRNA half-lives in the new model (2—10 min) are 
more in line with the best recent measurements. 54 ' 56 

Our results show the benefits of absolute quantification of 
molecules in single cells. Because our approach 
counts mRNA molecules, we avoid the pitfalls 
of converting inherently relative data to absolute 
quantities by an intrinsically uncertain normal- 
ization factor. Remarkably, in addition to the 
benefits of having single-cell mRNA abundance 
measurements for mathematical modeling of 
gene expression, this data also provided insights 
into the dynamics of gene expression during cell 
cycle progression. We were able to determine if 
genes were constitutively expressed or regulated, 
and for the regulated genes, we were able to 
deduce the fraction of cell cycle time in which 
these genes are expressed and the magnitude of 
change in expression — all from analysis of static 
mRNA distributions in unperturbed asynchro- 
nous cells. 

Although our experimental design precluded 
us from correlating transcript numbers to cell 
cycle phase, asynchronous cells can be staged 
using morphological markers such as the pres- 
ence of a bud, myosin and/or Whi5 localization, 
and microtubule structures. 14 ' 57 Including such 
markers in automated analyses and correlat- 
ing transcript numbers using 2-color FISH 42 or 
fluorescent protein labeling of mRNA 58 could 
produce high-definition data sets that include 
quantitative temporal information on single- 
cell behavior, for hypothesis testing and sto- 
chastic modeling of gene expression networks. 

Materials and Methods 



integration of the GFP tag by PCR-amplification of genomic 
DNA and growth characteristics, and protein dynamics were 
normal, suggesting that the fusions were functional. 33 Cells were 
grown to an OD 6()0 of 0.4 in synthetic complete (SC) growth 
medium with 2% glucose at 30 °C prior to fixation and hybrid- 
ization of probes. 
Cell cycle analysis 

For cell cycle synchronization experiments with the CLN2- 
GFP strain, cells were grown as described above and arrested 
in Gj by the addition of 20 uM ot-factor. Cell cycle arrest was 
monitored by removing 500 uL of culture, sonicating the cells 
(Branson Sonifier with microprobe, 10 s at power setting 4), and 
observing cells under differential interference contrast (DIC) 
microscopy until >95% of cells are unbudded with a small mat- 
ing projection (-60 min). To release the cells from arrest, cells 
were harvested by centrifugation, washed once in sterile water, 
and resuspended in warm SC medium. 



Yeast strains and media 

Each S. cerevisiae strain used in this 
study included GFP inserted as an in-frame 
C-terminal fusion with a gene involved in 
cell cycle control. 59 All strains were purchased 
from Invitrogen. Strains were tested for correct 
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Figure 6. Oscillatory dynamics of key proteins and mRNAs of the model. (A-C) Deterministic 
simulations of the model showing time-course plots of protein concentrations (A and B) 
and mRNA abundances (C). (D-F) Stochastic trajectories of the same species are shown as 
in panels (A-C). In deterministic simulations, cell volume is divided equally at cell division 
(indicated by the arrows at the bottom of each plot). In stochastic simulations, cell volume is 
divided 60:40 (mother:daughter) at cell division. 
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At various time points before and after synchronization, we 
removed a 60 mL sample with 50 mL to be processed for FISH, 
as described below, and 10 mL to be processed for fluorescence- 
activated cell sorting (FACS) analysis of DNA content and for 
bud morphology analysis (budding index). The 10 mL sample 
was centrifuged and washed twice in sterile nuclease-free water. 
The cell pellet, consisting of approximately 2 x 10 8 cells, was 
resuspended in 1 mL of water and divided equally into 2 micro- 
centrifuge tubes. The cells were sonicated, and an equal volume 
of absolute ethanol was added to fix the cells, which were stored 
at 4 °C overnight (or longer). Fixed cells were washed twice in 
sterile nuclease-free water before processing for FACS or budding 
index. 

Samples for budding index analysis were resuspended in 1 
mL of sterile nuclease-free water and stained with 2 ug/mL final 
concentration of Hoechst 33342 (Invitrogen, http: //products. 
invitrogen.com/ivgn/product/H1399) for 2 h. Cells were then 
briefly sonicated, washed once in sterile nuclease-free water, 
and wet-mounted on slides for observation on a fluorescence 
microscope as described below except that single focus images 
were acquired. After acquisition, image files in TIFF format 
were automatically converted to Omero files and uploaded to an 



image storage server using Omero Insight's importer. 60 We used 
composite images of the phase-contrast and Hoechst chan- 
nels to score unbudded G t cells, small-budded S-phase cells 
in which the bud is <l/3 the size of the mother, large-budded 
G 2 /M cells with a single aggregate of DNA (nucleus), and large- 
budded anaphase/telophase cells with an elongated nucleus in 
the mother-bud neck or with mother and bud both containing a 
single nucleus, respectively. 

Samples for FACS analysis of DNA content were sonicated, 
pelleted, and resuspended in 1 mL of RNase A (Sigma- Aldrich) 
solution (50 mM Tris-CL, pH 8.0, 15 mM NaCl, 2 mg/mL 
RNase A), and incubated at 37 °C for 4 h. 61 After incubation, 
proteinase K (New England Biolabs) was added directly to the 
solution to a final concentration of 200 ug/mL, and the tube 
was incubated at 37 °C for a further 1 h. Cells were then cen- 
trifuged, resuspended in 50 mM Tris-Cl, pH 8.0, and stained 
for immediate FACS analysis or stored at 4 °C. DNA staining 
was accomplished by adding 100 uL of cells to 1 mL of SYTOX 
green (Invitrogen, http://products.invitrogen.com/ivgn/prod- 
uct/S7020) solution (1 uM SYTOX Green in 50 mM Tris-Cl, 
pH 8.0). The stained cells were briefly sonicated, and FACS 
analysis was performed on a BD LSR II using the Argon laser 
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Figure 7. Low mRNA abundance increases cell cycle variability. In these 4 panels we compare the means and coefficients of variation (CV, standard devi- 
ation/mean) of 5 cell cycle metrics (T„. . interdivision time; T„ duration of unbudded phase; T„,„, duration of budded phase; V„. ... size at birth; V„. . . , 
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size at division) for mother cells and daughter cells. Blue bars, experimental measurements (from ref. 14); green bars, model simulations as described in 
the text; red bars, model simulations with 4-fold reduction in the average numbers of transcripts of all genes in the model. 
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! nm) and FITC filter (505 LP splitter and 525/50 BP filter). 
Data was collected and analyzed with BD FACSDiva 7.0. 
FISH probes 

Five, 50mer oligonucleotides were designed to target the 
GFP moiety of mRNA transcribed from the GFP-tagged genes 
(Supplementary Methods; Table 1). The oligos were synthesized 
commercially (BioSearch Technologies), and each oligo con- 
tained 5 modified Ts that carried an amino group. A DyeLight 
550 NHS Ester labeling kit (Thermo Scientific, http://www. 
piercenet. com/browse. cfm?fldID = E3F90446-A514-E35F- 
2A86-0F053949F201) was used to conjugate the fluorescent dye 
DyeLight 550 to the modified Ts. 

List of oligonucleotides used in this study 

See Table 5. 

Hybridization 

Hybridization was performed as described. 42 Briefly, cells 
were first fixed by incubation in 4% (v/v) paraformaldehyde 
(Electron Microscopy Sciences, http://www.emsdiasum.com/ 
microscopy/products/chemicals/formaldehyde. aspx) for 45 
min. After washing 3 times in Buffer B (1.2 M Sorbitol, 100 
mM KHP0 4 , pH 7.5), cells were resuspended in spheroplast 
buffer (1.2 M Sorbitol, 100 mM KHP0 4 pH 7.5), 20 mM 
Ribonucleoside-Vanadyl complex (New England Biolabs, 
https://www.neb.com/products/sl402-ribonucleoside-vana- 
dyl-complex), 20 mM (3-mercaptoethanol, and 25 U lyticase 
(Sigma-Aldrich, http : //www.sigmaaldrich. com/catalog/ prod- 
uct/sigma/12524), and incubated at 30 °C for 7 min to par- 
tially digest the cell walls. Cells were placed on poly-L-lysine 
(Sigma-Aldrich) coated coverslips and allowed to settle for 30 
min at 4 °C. The coverslips were washed once with Buffer B to 
remove unattached cells and stored overnight in 70% ethanol. 
The ethanol was removed, and the coverslips were incubated 
twice in 2x saline-sodium citrate (SSC) for 5 min and once in 
40% formamide/2x SSC. Coverslips were then incubated with 
a mixture of the 5 probes (0.4 ng each, 2 ng total) overnight at 
37 °C. Unbound probe was removed by washing the coverslips 
twice with 40% formamide/2x SSC, once with 2x SSC/0.1% 
Triton, and once with lx SSC. The coverslips were then washed 
in a Hoechst 33342 solution to stain nuclei and mounted on 
slides with Prolong Gold mounting medium (Life Technologies, 
http://products.invitrogen.com/ivgn/product/P36930). 

Imaging 

All images were collected on an Axio Observer Zl (Zeiss, 
http://microscopy.zeiss.com/microscopy/en_de/products/ 
imaging-systems/cell-observer.html) microscope equipped with 
a CoolSNAP HQ 1 CCD camera (Photometries, http://www. 



photometrics.com/products/ccdcams/coolsnap_hq2.php), a hal- 
ogen lamp for bright field imaging, and a 120 W metal halide 
lamp for fluorescence excitation. The microscope is controlled by 
custom software developed in MATLAB that relies on the API of 
the open-source microscopy control software, u,Manager. 62 The 
custom software performs image-processing tasks during acquisi- 
tion and, among other features, allows for the automatic identifi- 
cation of suitable fields of view (FOVs) ; therefore, maximizing the 
number of cells that can be sampled. For each strain, 20 FOVs, 
containing '1000 cells in total, were automatically selected from 
a user-defined area of the slide. A z-stack was collected at each 
FOV containing 31 focal planes separated by 0.2 p,m, and 2 color 
channels: DyeLight 550 (Chroma filter set SP102vl, exposure 
time: 2 s/plane), and Hoechst 33342 (Chroma filter set 49000, 
exposure time: 150 ms/plane). In addition, a single phase-con- 
trast image was acquired at the central plane for automated cell 
identification (see below) . 
Image processing 

Phase-contrast images were segmented using custom software 
derived from Yeast Tree 1.6. 3. 15 The application relies on the 
MATLAB Image Processing toolbox. First, the function "imfill" 
was used to flood-fill local minima not connected to the image 
border, which fills in the center of the groups of cells. As each 
group of cells has slightly different levels to which the flood- 
fill will rise, we then searched the image histogram for inten- 
sities greater than the calculated background, taken from the 
border pixels, and with a frequency greater than the minimum 
cell area, generally set to 200 pixels. To keep only large groups 
of connected pixels, erosion (built-in function "imerode") was 
performed, removing the outermost pixels of a region and elimi- 
nating small groups of pixels. 

The next step is to separate these groups into individual cells. 
This was done with another call to "imerode" to cut the small 
necks that appear between touching cells. Once the cells are cut, 
the remaining connected regions were labeled with a call to the 
built-in function "bwlabel", which identifies the individual cells 
and assigns each with a unique label. To finish, the cells were 
returned to their original sizes with a dilation (built-in function 
"imdilate"), which adds pixels around the edges of each cell. 

Prior to the identification of spots, the 3-dimensional fluores- 
cence z-stacks were reduced to 2D images by use of a maximum 
z-projection. Single spots, corresponding to mRNA molecules, 
were then identified using the algorithm described. 41 First, a 
local background subtraction, in which the mean of a 19 x 19 
pixel neighborhood is subtracted from the central pixel, was used 
to highlight the bright punctate spots from the autofluorescence 



Table 5. List of oligonucleotides used in this study 



Probe ID 


Sequence 8 


GFP 70 


TCCGTATGTT GCATCACCTT CACCCTCTCC ACTGACAGAA AATTTGTGCCC 


GFP 304 


GGGTATCACC TTCAAACTTG ACTTCAGCAC GTGTCTTGTAG TTCCCGTCATC 


GFP 555 


TGGACAGGTA ATGGTTGTCT GGTAAAAGGA CAGGGCCATC GCCAATTGGA GTA 


GFP 628 


GCAGCTGTTA CAAACTCAAG AAGGACCATG TGGTCTCTCT TTTCGTTGGG ATC 


GFP 682 


TAGAAGTGGC GCGCCCTATT TGTATAGTTC ATCCATGCCA TGTGTAATCTC 



"Nucleotides in bold are modified with amino groups for conjugation to a fluorophore. 
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background of the cells. Then, pixels 5 standard deviations above 
the mean pixel value of the entire image were chosen as initial 
seeds for the spot detection routine. Gaussian weighting was 
then used to move each seed to the center of the 2D Gaussian 
intensity distribution. Spots with an integrated intensity of <350 
are likely the result of non-specifically bound individual probes 
and were therefore removed. Finally, to avoid double counting a 
single mRNA, if the centers of 2 spots were within 2 pixels of 
each other, then the dimmer of the 2 spots was removed. 
Microarray data analysis 

Microarray-derived relative gene expression data sets for each 
cell cycle gene were downloaded from Cyclebase. 46 These data 
sets represent 6 independent time-course microarray experi- 
ments performed using synchronized budding yeast cell cul- 
tures. 44 ' 45 ' 47,48 Because these data sets are from cells synchronized 
at different points in the cell cycle, Cyclebase provides them in a 
form in which all the experiments are time-registered to start at 
the same point in the cell cycle — roughly, late M/G r Since the 
time points were different for the various experiments, we used 
the MATLAB cubic spline interpolation function to obtain high- 
resolution (0.001 min intervals) data sets for each experiment. 
We then obtained the means for 16 common time points (12 min 
intervals — the same as Spellman et al. 48 ) across the 6 data sets 
and performed a cubic spline interpolation (0.001 min intervals) 
of the mean expression levels to plot the mean expression profiles 
over the course of 2 cell cycles (Fig. SI). These mean values were 
used for comparisons to the single mRNA FISH data. 

Statistical analysis 

The transcription of constitutively expressed genes was 
assumed to be the result of a random birth and death process 
with exponentially distributed waiting times. To characterize 
variability between experimental replicates, the following anal- 
yses were conducted on each experimental replicate separately. 
We used maximum-likelihood estimation to compare the distri- 
butions of mRNAs in unsynchronized populations to a single 
component Poisson function, with probability density function 
(PDF) defined as: A 

1 jc! 

where A. is the mean of the Poisson distribution. 

For regulated genes, we assumed that our population includes 
2 groups: one group of cells transcribing the gene at the low rate, 
and the other transcribing at a high rate. We make the further 
assumption that transcription in both groups has reached steady 
state, and so using maximum-likelihood estimation, the distribu- 
tion was compared with a two-component Poisson with PDF: 

(2)> i v -a, X x e Xl 



The goodness of fit test statistic is: 

G = 2j^O i \n(O i /E i ) 

where i = l,...,k indexes the number of observed outcomes, and 
O. and E. represent the number of observed and expected counts 
for each outcome. To satisfy the asymptotic assumptions of this 
test, outcomes which had few counts were binned together, such 
that E. > 5 in all cases. Under the hypothesis that the observed 
data arise from the theoretical distribution, G has an asymptotic 
Chi-square distribution with k - p - 1 degrees of freedom, where 
k is the number of data points, and p is the number of parameters 
estimated from the data (p = 1 for the Poisson model and p = 
3 for the two-component Poisson model). The P value for this 
hypothesis test is calculated as the upper tail of the appropriate 
Chi-square distribution. 

The measure of effect size <$> 2 is: 

1 v (0,-£ ; ) 2 



where X l and \ 2 are the means of the mRNA abundance in the 
2 transcription states, and B is the proportion of cells with the 
lower mean transcript abundance (Xj), 0 < B < 1. 

To determine if the experimental distributions, O(x), were 
adequately described by these distribution functions, we used 
likelihood ratio goodness of fit hypothesis tests, 4> 2 measures of 
effect size, and quantile-quantile plots. 



n i n, t 

where n is the sample size for a given gene. This measure sum- 
marizes the average normalized squared deviation between the 
number of observations in a cell and the number expected under 
a postulated model. Effect size conventions proposed by Cohen 
(1988) were used to classify small, medium, and large effects. 63 

Quantile-quantile (Q— Q) plots (Fig. S2) were created by 
plotting the observed quantiles of the data against the corre- 
sponding theoretical quantiles of the proposed models. If data 
truly arise from the proposed model, then the points on the Q— Q 
plot will adhere closely to the X = Tline. Since these data and the 
corresponding models take non-negative integer values, a small 
amount of random noise was added (also known as jittering) to 
enhance visibility in the Q— Q plots. 

Modeling 

The model presented here is nearly identical to that described 
in Barik et al. 10 with some exceptions noted in the modeling sec- 
tion of the results and listed in Tables S3 and S4. Deterministic 
simulations were performed in MATLAB using the odel5s 
solver. In deterministic simulations, we triggered cell division 
whenever ClbM concentration level drops below 12.5 nM as the 
cell exits mitosis. At cell division, cell volume is distributed to 
mother and daughter cells in a 60:40 ratio, 14 but protein concen- 
trations are unchanged. 

Stochastic simulations were performed using Gillespie SSA. 35 
In stochastic simulations (as in the deterministic case), cell divi- 
sion occurs whenever ClbM concentration drops below 12.5 
nM. To avoid multiple divisions caused by ClbM concentration 
fluctuating back and forth across 12.5 nM, we insist that ClbM 
concentration must increase above 20 nM before a subsequent 
division can be triggered by falling ClbM concentration. We use 
a pseudo-steady-state approximation for activation of the CLBS 
gene, under the assumption that SBF binds to and dissociates 
from the CLBS gene very rapidly. At cell division, we partitioned 
cell volume and all molecular species to the mother and daughter 
cells in a 60:40 ratio, except for Cln3, which was apportioned 
75% to the mother cell and 25% to the daughter cell. This rule 
was intended to account for the fact that CLN3 expression in 
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newborn daughters is 1/3 that of newly divided mother cells due 
to repression of CLN3 by Ace2 and Ashl. 

In stochastic simulations, we followed both mother and 
daughter cells to generate a complete pedigree of growing and 
dividing cells. For each cell, we recorded T G1 = time from birth 
to onset of DNA synthesis (when ClbS rises above 25 nM) and 
T = time from birth to Start (when SBF rises above 15 nM). 
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